Abstract



This paper describes the current evaluation practices employed by Scientific Work
Experience Programs (SWEPs) across the country. A survey on current practices in evaluation
was administered to SWEP program directors in 1995 and the results are summarized. Results
were analyzed to determine the degree of commonality across SWEPs in their evaluation
purposes, contexts, and strategies, and the degree of consensus on important goals and
objectives across the projects. The authors conclude that there is sufficient commonality in
these areas to support a multi-site collaborative evaluation effort. Issues relating to the design
of a collaborative evaluation strategy are discussed.



This paper is considered a "working paper" and the authors would welcome comments,
suggestions, and corrections. Comments may be directed to Kathryn Sloane, College of
Education, University of Illinois, 1310 S. Sixth Street, Champaign, IL 61820 (office phone:
217-333-8530; Internet address: ksloane@uiuc.edu) or Judy Young
(jyoung@cello.gina.calstate.edu).
Evaluation of Scientific Work Experience Programs for Teachers:

Current Practice and Future Directions 

Kathryn Sloane and Jody Young 
INTRODUCTION

At the 1994 National Conference of Scientific Work Experience Programs (SWEP),
program evaluation emerged as the "topic of greatest concern to program managers and
funders" (Conference Report, 1995). There were "lively discussions" during the Conference
sessions on program evaluation topics and "many unanswered questions" following reports on
current evaluation strategies (by local SWEPs or by National Center for Improving Science
Education (NCISE)). There seemed to be a sense of frustration among the conference
participants: local program evaluation is a professional responsibility and a political necessity,
but the projects are complex, the outcomes difficult to measure, and time and resources for
evaluation are slim. There was a strong desire to share ideas and strategies, and a renewed
discussion of the idea of a collaborative multi-site evaluation effort.

In response to the discussion at the Conference, IISME agreed to commission a "white
paper" to pull together some of the issues and concerns in SWEP evaluation and (perhaps) to
propose some future directions in local or national evaluation. As a first step in achieving this
aim, the authors surveyed SWEP program managers to gain a clearer picture of "the current
state of affairs": What are the evaluation requirements at the local level? What are the important
project objectives and which of these are, and are not, being evaluated? What types of
evaluation strategies are currently in place? What are the most pressing concerns with respect
to local evaluation?

In this paper, we summarize and discuss the results of this survey with three purposes
in mind. The first is to provide information to the SWEP community on current practices in
evaluation. There seems to be a great deal of interest in "what others are doing" and in whether
other projects are struggling with the same issues. The second purpose is to determine if there
is enough common ground (in project purposes, evaluation requirements, and existing
strategies) to proceed with plans for a national evaluation, and/or the development of a
"common set of procedures" that local projects might use. Finally, we offer some suggestions
on ways the survey and the survey results might guide further discussions of local or national
evaluation strategies, and some methods that might be considered in such efforts.



THE SURVEY 



In July 1995, we sent a survey to all of the 75 SWEPs listed in the latest SWEP
directory. The survey contained questions about current evaluation requirements and
strategies, as well as a list of project goals and objectives that we asked respondents to rate on
importance. We received full responses from 35 project directors. Of the remaining 40, we
learned that six projects are now defunct, that many others were not fully "up and running"
(i.e., only had 2 or 3 teachers in summer positions as of yet), and that some project directors
didn't want to complete the survey because they felt their projects were too small or too new.
There were a few projects that, to the best of our knowledge, were well-established in 1993,
but we were unable to elicit responses from those projects. Our best estimate is that the
"potential population" for this survey was about 50 projects, giving us a return rate of 70%.

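The return-rate arithmetic in the preceding paragraph can be checked with a short sketch. The counts (75 listed projects, 35 full responses, an estimated 50 potentially active projects) come from the text; the function and variable names are illustrative assumptions, not part of the survey.

```python
def return_rate(responses: int, population: int) -> float:
    """Return the response rate as a percentage of the given population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * responses / population


# Counts reported in the text.
listed_projects = 75        # SWEPs in the directory
full_responses = 35         # project directors who responded in full
estimated_population = 50   # the authors' "best estimate" of active projects

# Rate against every listed project versus the estimated active population.
print(f"Against all listed projects: {return_rate(full_responses, listed_projects):.0f}%")
print(f"Against estimated population: {return_rate(full_responses, estimated_population):.0f}%")
```

The gap between the two figures shows how much the reported 70% depends on the estimate of which listed projects were actually operating.
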
SURVEY RESULTS 

The full survey and tallies of the item responses for the total group are presented in
Appendix A. The items were also analyzed for differences among subgroups of projects, with
subgroups defined along the following dimensions¹:

Type: Industry-based (n=17) or Research-based (TRAC or University Research Lab)
(n=16) or "other" (n=2)

Size: Large (more than 15 teachers) (n=17) or Small (n=16)

Age: Mature (more than 5 years old) (n=26) or Young (n=7)

While there were some differences on specific items, in general the responses were remarkably 
consistent across all of the subgroups. Therefore, we discuss the results for the group at large, 
noting areas in which there were noteworthy differences across subgroups. 

Who Evaluates and Why?

Nearly all projects have some formal evaluation component: 27 respondents reported
conducting formal program evaluations, and an additional 3 projects have evaluation strategies
"under development." Evaluation is required in 24 (69%) of the 35 projects, but the remaining 11
projects have no formal requirement.

We posed a series of questions to try to discern the primary purposes of existing
evaluations. These questions were: By whom are you required to do evaluation? What do they



¹ The cross tabulations are as follows: Research: small and young (6); large and mature (8); small and mature
(2). Industry: small and mature (7); large and mature (7); small and young (1).
want to know? Who would read an evaluation report if you wrote it? What are the priority
ratings of different potential purposes of evaluation? How do you currently use the evaluation
data you collect? The responses are reported in detail in Appendix A and are summarized
briefly in the following.

By whom are evaluations required? Not surprisingly, the most frequent response to
this question was "the funding agency." Grants from the National Science Foundation (NSF),
the National Institutes of Health (NIH), and other federal and private agencies mandate some
form of evaluation, at least of the project activities funded by the grant. The TRAC program
contains an evaluation component (managed by Associated Western Universities, as well as
recent work by NCISE) required by the Department of Energy. Local governing boards
(Boards of Directors, self-governing councils) also require program evaluations at many sites.
Two respondents named an outside evaluator as the one requiring evaluation; but presumably,
those evaluators were hired (or mandated) by a funding agent or a local governing board.

What do they want to know? The most frequent (n=9) category of response to this
question was attainment of goals. This category includes broad statements such as, "how well
we meet our goals," "program effectiveness," or "return on the dollar." In these broad
statements, "goals and objectives" or "effectiveness with respect to what" were not defined and
presumably include implementation goals and/or desired impact. Other responses could
be categorized more specifically. Six (6) responses contained an explicit emphasis on
implementation of project activities, such as, "evaluation of how each component of the grant
has been carried out," "ratings of aspects of the program (e.g., availability of resources,
assistance by staff, relationships with mentors)", "lab activities and enrichment activities," or
"parameters of teachers' research experiences." Some responses focused on specific types of
outcomes. The most frequent, by far, (n=[illegible]) was teacher outcomes, defined as changes in
attitudes and behaviors, changes in philosophy of education and teaching styles, or retention in
teaching careers. These were distinct from specific issues of classroom transfer, which were
listed in only two (2) responses. And interestingly, student outcomes were mentioned
specifically in only three (3) responses (and one qualified it by saying "sometimes"). Another
two (2) responses noted sponsor outcomes, such as sponsor satisfaction or impact on mentors.

What are the primary purposes of your evaluation? In this question, respondents were
asked to rate a series of purposes as "primary," "secondary," or "probably not a purpose."
Consistent with the emphasis on attainment of goals and outcomes as the information desired
by funders and governing boards, most program managers rated "monitor outcomes of the
existing program" as the top priority (34 out of 35 responses). "Use as justification for
funding" ranked next, with 24 respondents rating this as a "primary purpose". Traditional uses
of formative evaluation (for ongoing program adjustment and for pilot-testing new activities or
strategies) ranked third, with a little over half of the program managers rating these purposes as
"primary." Almost none of the program managers saw "comparing your SWEP to other
programs" as a primary purpose; 14 coded this item as "not a purpose" and another 16 rated it a
"secondary purpose" at best.

How do you currently use the data you collect? This question was included to
distinguish between the intended purposes of evaluation and the actual uses of evaluation data.
About half (n=15) of the program managers cited formative (program improvement) uses of the
data and about one-third (n=10) described more summative uses (judging effectiveness;
decisions regarding continued funding). Reporting functions (e.g., annual reports to funders
and sponsors, journal articles, presentations to professional groups) were cited in nine (9)
responses. Recruitment and marketing was mentioned in four (4) responses (but was also
listed as a possible "other use", in another survey question, an additional five times).
Contrasting these responses with those to the previous question, it appears that there is a
slightly greater emphasis on "outcomes" in the intended purposes than in the actual uses. But
on the whole, intents and uses of evaluation seem fairly consistent (which in this evaluator's
opinion is somewhat remarkable).

Who would read an evaluation report? This question was designed to identify the 
potential audiences for the evaluation. Three primary audiences were listed most frequently: 1)
funding agencies (n=21); 2) project staff and management (n=15); and 3) the management and 
mentors at the local industry or research sites (n=15). Interestingly, teacher participants or 
school administrators were listed much less frequently (n=7 and n=5, respectively), about as 
often as academic colleagues (n=6). 
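
To put the audience counts above on a common footing, the following sketch expresses each as a share of the 35 responding projects. The counts are taken from the paragraph; the dictionary keys and the helper name `share_of_respondents` are illustrative assumptions. Respondents could name more than one audience, so the shares are not expected to sum to 100%.

```python
RESPONDENTS = 35  # projects that returned full survey responses

# Number of respondents naming each audience, as reported in the text.
audience_counts = {
    "funding agencies": 21,
    "project staff and management": 15,
    "site management and mentors": 15,
    "teacher participants": 7,
    "academic colleagues": 6,
    "school administrators": 5,
}


def share_of_respondents(count: int, total: int = RESPONDENTS) -> int:
    """Return the count as a whole-number percentage of all respondents."""
    return round(100 * count / total)


# List audiences from most to least frequently named.
for audience, count in sorted(audience_counts.items(), key=lambda kv: -kv[1]):
    print(f"{audience:30s} n={count:2d}  {share_of_respondents(count)}%")
```
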

Commonalities in Evaluation Contexts

Based on the results reported above, the 35 responding SWEP sites have many
commonalities in the contexts within which local evaluation occurs: Formal evaluations are
required from "the top", with the clients and primary audiences of the evaluations being what
might be termed the "upper level management" (funders, sponsors, governing bodies) in the
projects. There are consistently high expectations (or at least desires) for proof of program
effectiveness, regardless of whether the project is relatively new or well-established, or
whether it serves a very small or very large number of teachers. Formal evaluation is mandated
for accountability purposes (documenting implementation of funded activities, verifying
attainment of stated goals, demonstrating program effectiveness), but project directors also
have clear needs and uses of evaluation that go beyond the required accountability purposes
(internal program adjustment and improvement, dissemination of information about the project,
recruitment and marketing). Finally, the projects appear to share a predominantly "preordinate"
or "goals-oriented" approach to evaluation, i.e., evaluations focus on the question,
"are pre-established goals and objectives being met in practice?".

This consistency in the contexts for evaluation bodes well for future efforts to
collaborate on evaluation studies: it suggests that projects share similar constraints, concerns,
and expectations regarding the purposes of evaluation, which is probably a necessary condition
for collaborative efforts in this area. An even more important condition, however, is that the
content of the evaluation is consistent, or at least compatible. In other words, are the projects
consistent in their views of the important goals and objectives they are trying to attain? We
turn next to that question, in our analyses of the ratings of program objectives.

Priority Outcomes 

In this part of the survey, we included a "laundry list" of statements representing 
possible goals, objectives, or intended outcomes of the SWEP experience. To construct the 
list, we gathered documents (brochures, reports, evaluation instruments) from a number of
SWEPs and included just about every statement we could find regarding goals or objectives. 
There was considerable overlap, of course, but we retained the various statements so we could 
see which ones resonated most with the largest number of respondents. The intent here was to 
determine if there was agreement on the "most important" objectives that might be assessed 
across projects, and if the wording of those objectives could provide direction in the 
development of specific instruments and strategies. The statements were grouped into six
broad categories: a) institutional and program support; b) program implementation; c) teacher
effects; d) classroom effects; e) student outcomes; and f) school and community impact. Some
categories have more items than others; this is because there were more statements relating to
these categories in the project materials we reviewed. 

Respondents were asked to rate each statement on a 5-point "priority" scale, with the 
following points defined: 

5 = Highest priority. Critical outcome of our program; program cannot be considered
successful if this does not occur for most teachers.

3 = Moderate priority. Desired objective of our program; would hope this occurs for
many teachers.

1 = Low priority. Would be "nice" if this occurred for some teachers.

There was a tendency for respondents to rate most of the items highly, which is not
surprising given the sources of the statements. Of the 76 statements, over half (n=43) have
mean ratings greater than 4.0. Items with relatively lower mean ratings tended to have larger
standard deviations, indicating greater variability in the priority ratings assigned. The means
and standard deviations for all of the items are reported in Appendix A.



In the following sections, we list and discuss those items in each category that were
rated as having the "highest priority" across projects. We have selected those items with mean
ratings greater than 4.0. Within each category, we give special attention to those items that
clearly stood out as important (mean ratings greater than 4.2 or 4.5) and for which there was
relatively strong agreement (standard deviations less than 1.0) about their importance. For
contrast, in some categories we also mention those items that received notably low ratings, or
those for which there was considerable disagreement across projects.
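
The selection rule described above can be sketched as a small classifier. This is only an illustration under stated assumptions: the text gives the 4.0 cut for "priority" items and the 1.0 standard-deviation cut for agreement, but it mentions both 4.2 and 4.5 for items that "clearly stood out", so the 4.5 default below is an assumption, as are the names `Item` and `classify` and the category labels. The three sample rows use means and standard deviations reported in the tables that follow.

```python
from typing import NamedTuple


class Item(NamedTuple):
    label: str   # category and item letter, e.g. "Tchr A"
    mean: float  # mean priority rating on the 5-point scale
    sd: float    # standard deviation of the ratings


def classify(item: Item,
             priority_cut: float = 4.0,
             standout_cut: float = 4.5,   # assumption: the text cites 4.2 or 4.5
             agreement_sd: float = 1.0) -> str:
    """Label an item by its mean rating and by agreement among raters."""
    if item.mean > standout_cut and item.sd < agreement_sd:
        return "standout"   # very high rating with strong agreement
    if item.mean > priority_cut:
        return "priority"   # rated highly, possibly with less agreement
    return "lower"          # mean rating of 4.0 or below


# Sample rows taken from the tables later in the paper.
items = [
    Item("Tchr A", 4.81, 0.47),
    Item("Tchr I", 4.30, 1.13),
    Item("Comm A", 3.68, 1.32),
]

for item in items:
    print(f"{item.label}: {classify(item)}")
```

Keeping the cut points as parameters makes it easy to see how the list of "standout" items would change if 4.2 were used instead of 4.5.
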

Institutional/Corporate Support

Eight (8) items were included in this category, focusing on mentor reactions and
outcomes, and on changes in the institutions which support the projects. Of these eight items,
only three (3) had mean ratings greater than 4.0. These are displayed in Table 1.

Table 1

Priority Outcomes for Institutional/Corporate Support

CAT   ITEM                                                                Mean         SD
IS    A. Mentors feel that the program is worthwhile for teachers.        [illegible]  0.77
IS    H. Teachers successfully complete the task assigned them.           [illegible]  1.06
IS    B. Mentors feel that the program is worthwhile for themselves.      4.11         0.71
IS    CATEGORY TOTALS                                                     3.58         1.19



The priority items in this category basically reflect a focus on mentors’ satisfaction with 
the project and with their teachers' performances during the summer internships. Items that did 
not receive high ratings focused on broader outcomes, such as more institutional and Board 
support for education (mean rating about 2.7), or greater understanding (on the Mentor’s part) 
of teachers’ roles and responsibilities in schools (mean ratings about 3.6). The Category Total 
row lists the mean and standard deviation for all of the eight items in the category; the relatively 
low mean² (3.58) indicates that most of the items in this category did not receive consistently
high ratings across the projects. 



² A mean rating of 3.58 certainly indicates that goal statements had value to the project directors. But given the
consistently high ratings across items, items with mean ratings below 4.0 stand out as notably less important 
than other items or categories and/or as items for which there was less consensus about their importance. 



Program Implementation 

Nine (9) items considered aspects of the project structure and implementation, i.e., 
project activities and elements used to screen, place, and support teachers during the summer 
experience. In Table 2, the items are ranked according to mean ratings of importance.



Table 2

Priority Outcomes for Program Implementation³

CAT   ITEM                                                                Mean         SD
Prog  G. Teachers receive support for extending experience to classroom   4.66         [illegible]
Prog  I. Teachers will consider internship as a high level professional   4.57         0.90
         development program.
Prog  B. Teachers adjusted well to the demands of internship.             [illegible]  0.56
Prog  A. Screening process places teachers in best possible position.     4.43         0.85
Prog  D. Orientation and other program meetings will enhance internship   [illegible]  0.78
Prog  F. Teachers receive advice and support for sharing experience.      4.22         0.95
Prog  E. Teachers are exposed to a variety of scientific & technical      4.1[?]       1.01
         career
Prog  C. Increased participation of teachers of underrepresented groups.  4.00         1.00
Prog  H. Mechanisms / academies are developed to continue dialogue        [illegible]  1.25
         after the internship.
      CATEGORY TOTALS                                                     4.28         0.93



What is immediately notable is the high category mean, which indicates that nearly all of the 
items received ratings higher than 4.0. The first three items on the list were very high priorities 
in nearly all of the projects, with mean ratings above 4.5. One might argue that these items do 
represent the central, most important features of the SWEP experience: teachers are successful
in the industry/research setting; they see the program as relevant to their professional growth; 
and they receive support for translating the summer experience into classroom practice. 
Following close behind in ranking is the goal of placing teachers in the "best possible 
position"; this certainly is consistent with, and perhaps a prerequisite for, teachers adjusting 
well and finding the experience professionally rewarding⁴. There was fairly consistent
agreement on the importance of meetings (Orientation and other meetings) enhancing the 

³ All of the items in the category are included in Table 2, because there was only one item that did not meet the
4.0 or greater standard.

⁴ Establishing the criterion for this objective may not be straightforward, however, and as we will report in a
later section of this paper, very few projects collect data to assess this goal.
internship (i.e., being a worthwhile use of limited time during the summer) and on teachers
receiving advice and support for sharing their experience (although it is not clear if this means
with their SWEP teacher colleagues, teachers at their school site, or some other target group).
There was slightly less agreement on the importance of recruiting teachers of underrepresented
or minority student groups.

The one item that did not receive consistently high ratings in this category related to
follow-up mechanisms: Academies or other regular meetings during the school year to provide
ongoing support for teachers after the summer experience. For some projects (like the TRAC
program), this is not a viable option since teachers are not "local"; for other projects (like
IISME), the Academy structure is considered an integral part of the program model.

Teacher Effects 

In this category, we included those statements that focus on the teachers' own 
knowledge, attitudes, and professional skills. Items related specifically to classroom transfer 
are included in the next category. The ranked results in this category are displayed in Table 3. 

Fourteen (14) of the 21 items had mean ratings of 4.0 or higher. Again, there were
three items that emerged as very high priority, with mean ratings greater than 4.5. These, too,
represent central, defining features of the SWEP experience: gaining first hand knowledge of
the industry or research culture; becoming credible models to students of the excitement of
math and science; and gaining a renewed enthusiasm for teaching⁵.

While still enjoying high ratings, the next set of items are somewhat less "vital" to the 
success of the projects. There are relatively lower ratings, and relatively more variability in 
assigned ratings, for items that represent specific "manifestations" of the broader goals in the 
top three items. For example, it is considered very important, across projects, for teachers to
gain knowledge of the culture and careers in the industry or research environment. But it is
considered a little less important for teachers to gain specific knowledge of manufacturing or
research processes, to know about specific post-secondary opportunities, to demonstrate gains 
in their knowledge of subject matter, or to increase their awareness of specific subject-to-work 
applications. 



⁵ As an aside, verification of this last item has been an important issue in some projects. In IISME, for
example, there was great concern at the beginning of the program that teachers would be enticed to leave 
teaching for the world of industry. Consistent evidence across the years that the experience renews their 
commitment to teaching was an important point in explaining the program goals to potential sponsors. 
Table 3

Priority Outcomes for Teacher Effects

As a result of a SWEP internship, teachers will

CAT   ITEM                                                                Mean         SD
Tchr  A. gain first hand knowledge of industry/research culture and       4.81         0.47
         careers.
Tchr  L. become credible model to students of excitement with             4.61         [illegible]
         math/science subjects.
Tchr  K. demonstrate renewed enthusiasm for teaching.                     4.55         0.86
Tchr  M. have higher professional self esteem.                            4.43         0.98
Tchr  N. be revitalized after the summer.                                 4.37         0.84
Tchr  B. be more knowledgeable of manufacturing or research processes.    4.32         0.84
Tchr  O. have new perspectives on education.                              4.31         1.01
Tchr  I. develop activities to use in their classroom.                    4.30         1.13
Tchr  D. be more knowledgeable in their subject area.                     [illegible]  0.97
Tchr  C. increase awareness of specific subject to work application.      4.19         [illegible]
Tchr  J. be more self confident in work-world skills.                     4.19         0.98
Tchr  P. share experience with school personnel or community groups.      4.19         0.85
Tchr  E. know a larger number of post secondary opportunities for         4.06         0.83
         students.
Tchr  G. be more competent in the use of technology.                      4.00         0.79
Tchr  CATEGORY TOTAL                                                      4.05         1.09



Other sets of items show this pattern of greater agreement and higher value ratings on
broader purpose statements, with less agreement and lower value ratings on specific examples
of how that purpose might be manifested. For example, "share experience with school
personnel or community groups" had a mean rating of 4.19; but, items that might indicate
specific ways of sharing the experience (such as "conduct inservice related to internship", or
"become involved in school reform outside their own classroom") had much lower mean
ratings (around 3.2) and much higher standard deviations (over 1.25). Also rating relatively
less important were items relating to assuming new leadership roles in the school or district,
continuing with more professional development, and being retained in the teaching force. 
Classroom Transfer

This category had an equal number of items (n=21) to the Teacher Effects category, and 
represents the goals of translating the summer experience into classroom practices which will,
in turn, contribute to better learning and appreciation for math and science among students. 

The ranked items in this category are presented in Table 4. 



Table 4

Priority Outcomes for Classroom Transfer

As a result of a SWEP internship, teachers will

CAT   ITEM                                                                Mean         SD
Class L. use applications & examples from summer experience.              4.68         [illegible]
Class S. use more teamwork and cooperative learning with students.        4.49         [missing]
Class T. design & implement more hands-on lessons.                        4.38         [illegible]
Class R. promote student investigation & inquiry.                         4.37         0.77
Class G. revise or add new content to lessons & labs.                     4.26         0.86
Class I. integrate math, science and technology.                          4.12         1.07
Class J. provide more "business/real world" applications.                 4.11         1.23
Class U. act more as a facilitator than a lecturer.                       4.09         1.09
Class K. value and encourage better communication skills.                 [illegible]  1.15
Class CATEGORY TOTALS                                                     3.75         1.31



The project directors agree that, as a result of the summer experience, teachers should 
modify their classroom practice to include more applications and examples of how math and
science are used in industry and research. They agree that teachers should work to integrate
math, science and technology, to promote investigation and inquiry (perhaps through more
hands-on lessons), and to encourage communication skills. These items reflect a sense of what
the projects are promoting as "desirable" classroom practice in math and science education.
While these items are rated highly, there is more variability in the item ratings, as indicated by
the relatively larger standard deviations on at least five of the items. 

In this category, 12 of the 21 items had mean ratings less than 4.0. The pattern noted 
in the Teacher Effects category is very obvious in this category: there is agreement on the value 
of general principles of classroom transfer, but not on the specifics of how this should occur, 
or of what specifically should be expected of teachers in their classroom practice. For 
example, while there is agreement that teachers should encourage communication skills, project
directors are not, as a group, willing to assign priorities to teachers' requiring more oral reports
or assigning more written reports. To illustrate further, we include in Table 5 the list of items
that had substantially lower ratings.



Table 5

Items with Lower Ratings on Classroom Transfer

ITEM                                                                  Mean         SD
Q. increase use of computers & technology in their classroom.         3.80         0.96
A. invite mentors and speakers to schools.                            [illegible]  1.11
H. include lessons on science careers & job requirements.             3.54         1.38
E. provide activities that strengthen school-industry partnerships.   [illegible]  [illegible]
B. take students on field trip to internship site.                    3.46         [illegible]
F. increase emphasis on work habits such as punctuality,              [illegible]  1.22
   dependability, meeting deadlines, & professionalism.
D. receive materials or equipment from lab or industry.               [illegible]  1.11
C. take students on more field trips to industry and lab sites.       3.11         [illegible]
M. cover fewer topics but in more depth.                              3.09         1.33
N. require more reports & presentations.                              3.09         [illegible]
P. assign long term joint projects.                                   2.94         1.28
O. assign more formal written reports.                                [illegible]  1.12









There is less priority given to "career education" and specific strategies for increasing students’ 
knowledge of careers, or to increasing links between the specific industry or research site and 
the school (see items A, E, B, D, C⁶). Interestingly, the group emphasized improved
communication skills, but assigned much lower ratings to improved "work habits" among
students (such as punctuality, meeting deadlines, and the like). 



Student Outcomes 

While the SWEP model is one of professional development and teacher enhancement, the
students are, of course, the ultimate beneficiaries of improvements teachers may subsequently 
make in their classroom approach and instruction. Exactly how this "line of influence" is to 
occur is difficult to establish, however. And, there is ample recognition of the many factors 
that affect student attitudes, performance, and persistence in math and science (as in any other 
field of study). 



⁶ Links with the internship site are not possible in some projects, such as TRAC or other projects that recruit
teachers from wide geographic areas.



Of the 14 items included in this category⁷, eight (8) earned mean priority ratings greater
than 4.0. These ranked items are displayed in Table 6.

Table 6

Priority Outcomes for Students

As a result of having a teacher with a SWEP internship, students will

CAT   ITEM                                                                Mean         SD
Stud  D. improve skills in problem solving.                               4.39         [illegible]
Stud  K. enhance their observational & analytical skills.                 4.28         [illegible]
Stud  I. show increased enthusiasm and appreciation for science/math.     [illegible]  [missing]
Comm  E. be better prepared to enter the science/technical workforce.     [illegible]  1.11
Stud  B. increase knowledge of careers and requirements.                  4.14         0.91
Stud  C. have a greater appreciation of role of math, science and         4.08         0.91
         technology in society.
Stud  E. increase computer & technical literacy.                          4.0[?]       0.86
Stud  H. engage more in cooperative/collaborative learning.               4.00         1.17
Comm  A. more students graduate and enter math/science fields.            3.68         1.32



The priority items for student outcomes are fairly consistent with the priorities in the 
classroom transfer category: project priorities for student outcomes focus on improved skills in 
problem-solving, analyses, and technical literacy; priorities for classroom transfer focus on 
applying knowledge to "real-world" applications, promoting investigation and inquiry, and
integrating math, science, and technology. Project directors place a high value on students'
learning more about math and science careers, increasing their interest and enthusiasm in math,
science, and technology, and gaining a better appreciation of the roles these fields play in
society. Relatively high priorities (by some projects, at any rate) were also assigned to
students' increasing their knowledge of the world of work and considering careers in
math/science teaching. There is less endorsement of the goal of students' enrolling in more
math and science classes or becoming more involved in extracurricular math/science programs
(less than 3.5). And, there is relatively less value and agreement for the goal of having more
students enter math/science fields after they graduate (mean = 3.68). 



^Two items relating to student outcomes appeared in the "school/community" category, because they pertain to 
the impact of improving the technological talent and scientific literacy of the citizenry. They are included in 
this discussion, however, since they relate directly to expectations of student impact. 
School/Community Effects 

The final set of items focused on effects in the school environment or in the community 
(society) at large. The two items relating to students were discussed in the previous section. 

Of the remaining three items in this set, only one--"A 'critical mass' of program teachers will 
influence the climate of the school"--had a mean rating greater than 4.0. Much lower ratings 
were assigned to the following two items: "administration will be more involved in 
school-community partnerships" and "the public will become more involved with issues of education." 
Apparently, these potential benefits of the program are too distant and tenuous to receive 
consistently high priority ratings among the group of project directors. 

Commonalities in Priorities 

Our first and strongest reaction to these survey results is the remarkable consistency 
among respondents as to the highest priorities of their projects. Given the number of items, the 
overlap among them, and the various choices of wordings offered, we were skeptical about the 
degree of consensus that would be achieved. The distinctions among types of programs 
(research-based versus industry-based, for example) account for some of the discrepancies in 
ratings, but by and large there was a tremendous amount of agreement about what these 
projects aim to accomplish. This degree of consensus suggests that projects might well profit 
from collaborative evaluation efforts, since their basic goals and objectives are in harmony. 
Further, agreement on statements of goals and priorities provides a strong starting point for 
identifying criteria or data collection strategies that can be used to document goal attainment. 

One method of summarizing and interpreting the survey findings is to organize the 
statements in a "conceptual map" of the program model. A very simple version, perhaps better 
described as a "flow chart", is presented in Figure 1. In this figure, we have inserted the goal 
statements at different points, corresponding to the summer experience, the immediate 
teacher effects, intermediate effects (mainly classroom transfer), and longer-term effects for 
both teachers and students*. This organization may facilitate further discussion among project 
directors regarding: a) the implicit assumptions of how the program is expected to exert its 
influence; b) program mechanisms that are designed to facilitate the intended effects; and c) 
other ("extraneous") factors that may be influential at various points in the process. Further 
discussion and refinement of the program model is an important next step toward designing a 
collaborative evaluation. 

The figure, or one like it, may also be useful at the local project level, as a means of 
clarifying goals, expectations, and the linkages between intents and project mechanisms. For 
example, we administered this survey to the members of the IISME Board of Directors and to 

* A similar "program model" is included in Gottfried et al. (1992). 
Figure 1 

Conceptual Model of "Most Important" SWEP Goals 

PROGRAM IMPLEMENTATION 

Placement 
•Best possible position 
•Teachers successfully complete assigned tasks 
•Variety of careers 

Program Features 
•Teachers receive support for classroom transfer 
•Meetings enhance exper. 

Participant Reactions 
•Tchrs consider program high level prof develop. 
•Mentors think program is worthwhile for themselves and for teachers 

IMMEDIATE CHANGES 

Teacher Attitudes 
•Renewed enthusiasm for teaching 
•More self confident in work-world skills 
•Higher prof self-esteem 

Teacher Knowledge 
•Gain first hand knowledge of industry/research culture 
•New perspectives on educ. 
•Incr awareness of subject to work applications 
•More competent in technology 

Teacher Credibility 
•Credible model to students of excitement of m/s/t 

INTERMEDIATE CHANGES 

Classroom Practice 
•Use applications and examples from summer 
•Use teamwork and coop. learning 
•Hands on lessons 
•Promote investigation & inquiry 
•Encourage communication skills 
•Provide more "real" applications 
•Revise or add content 
•Integrate m/s/t 
•Act as facilitator rather than lecturer 

Teacher Roles 
•Share experience with school or community groups 

LONG-TERM CHANGES 

Student Attitudes and Behaviors 
•Enhance observational and analytical skills 
•Increase prob solving skills 
•Greater appreciation of role of m/s/t 
•Increased enthusiasm & appreciation for role of m/s 
•Increase knowledge of careers and requirements 
•Increase computer literacy 

Student Behaviors 
•Better prepared for science/tech world 
•More enter m/s fields 

Schools/Community 
•Critical mass of teachers influence sch climate 

the IISME Fellows during the summer 1995 session. Using Figure 1 as a model, we inserted 
the priority statements identified by teachers into one figure, those from the Board in another, 
and those agreed to by both groups in a third figure. These were then used in a Board retreat 
as a mechanism for strategic planning. The survey results provided information on the 
teachers' views of the important aspects of the experience as input into the strategic planning 
process. Further, areas of discrepancy in ratings could be explored: e.g., are 
discrepancies the result of different perspectives, or are there expectations that are not being 
clearly communicated to teachers, or do teachers see benefits and values that the Board is not 
aware of? The IISME staff and Board found the process to be extremely useful in their 
strategic planning. 

Current Evaluation Strategies 

The survey also contained questions about current evaluation and data collection 
strategies used by project directors. These results will be discussed in two parts: 1) what is 
being evaluated; and 2) what data collection strategies are being used. 

What is being evaluated? 

As part of the "priority ratings" of the statements of intended outcomes, we asked 
project directors to indicate whether or not that outcome is formally evaluated in their project. 
There were three possible responses: "yes", "informal or anecdotal evidence only", and "no." 

In Table 7, we have aggregated all of the "top priority" (mean ratings greater than 4.0) 
goal statements from all of the categories into one ranked list. We list the mean rating for 
reference, and then include the numbers of projects collecting "formal" data, "informal" data, 
or "no" data related to this outcome. 

In this table, statements for which 16 or more projects reported "formal" data collection 
strategies in place are highlighted in bold-face type. The choice of "16" was somewhat 
arbitrary, but chosen because it represents about half of the average number of responses 
across the items. We also realize that one person's "yes" might be another person's 
"anecdotal", but we assumed that if half of the project directors were willing to say "yes, we 
collect data on this item," then the item probably is being evaluated systematically (in some 
fashion) across projects. 
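
The highlighting rule described above can be illustrated with a minimal sketch. It is not part of the survey analysis itself: the cutoff of 16 comes from the text, the four rows are copied from Table 7, and the names `FORMAL_CUTOFF` and `is_formally_evaluated` are hypothetical labels chosen for this illustration.

```python
# Cutoff taken from the text: roughly half of the average number of responses.
FORMAL_CUTOFF = 16

# A few rows copied from Table 7: (category, item, yes, anecdotal, no).
rows = [
    ("Class", "L. use applications & examples from summer experience", 20, 5, 3),
    ("Teach", "K. demonstrate renewed enthusiasm for teaching", 16, 10, 4),
    ("Prog", "A. screening process places teachers in best possible position", 13, 9, 5),
    ("Stud", "D. improve skills in problem solving", 3, 9, 20),
]


def is_formally_evaluated(yes_count, cutoff=FORMAL_CUTOFF):
    """Return True when at least `cutoff` projects report formal data collection."""
    return yes_count >= cutoff


# Items that would be highlighted under the rule.
flagged = [item for _, item, yes, _, _ in rows if is_formally_evaluated(yes)]

for item in flagged:
    print(item)
```

Under this rule an item with exactly 16 "yes" responses is flagged, which matches the "16 or more" wording above.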
Table 7 

Evaluation Data by Priority Statements 

Evaluation data collected? (number of projects): Yes, Anecd (informal or anecdotal evidence only), No 

Cat    Item                                                                           Mean    Yes  Anecd  No
-----  -----------------------------------------------------------------------------  ------  ---  -----  --
Teach  A. gain first hand knowledge of industry/research culture.                     4.81    21   8      1
Class  L. use applications & examples from summer experience.                         4.68    20   5      3
Prog   G. Teachers receive support for extending exper to classroom.                  [?]     25   5      1
Teach  L. become credible model to students of excitement with math/science subj.     4.61    12   12     6
Teach  K. demonstrate renewed enthusiasm for teaching.                                4.59    16   10     4
Prog   I. Teachers will consider internship as a high level prof devel.               4.57    22   6      4
Prog   B. Teachers adjusted well to the demands of internship.                        4.50    17   10     5
Class  S. use more teamwork and cooperative learning with students.                   4.49    20   6      6
Prog   A. Screening process places teachers in best possible position.                4.43    13   9      5
Teach  M. have higher professional self esteem.                                       4.43    14   10     6
[?]    A. Mentors feel that the program is worthwhile for teachers.                   4.45    21   10     1
Stud   D. improve skills in problem solving.                                          4.39    3    9      20
Class  T. design & implement more hands-on lessons.                                   4.38    19   8      6
Teach  N. be revitalized after the summer.                                            4.37    13   11     5
Class  R. promote student investigation & inquiry.                                    4.3[?]  17   5      7
Teach  B. be more knowledgeable of manuf or research processes.                       4.32    19   9      1
Teach  O. have new perspectives on education.                                         [?]     17   5      8
Prog   D. Orientation/other meetings will enhance internship.                         4.30    24   2      5
Teach  I. develop activities to use in their classroom.                               4.30    24   4      2
Teach  D. be more knowledgeable in their subject area.                                4.26    17   5      8
Stud   K. enhance their observational & analytical skills.                            4.28    3    11     18
Class  G. revise or add new content to lessons & labs.                                4.26    21   6      5
Prog   F. Teachers receive advice and support for sharing experience.                 4.22    17   [?]    6
Stud   I. show increased enthusiasm and appreciation for science/math.                4.22    5    7      20
Comm   E. Students will be better prepared to enter the science/technical workforce.  4.20    2    6      24
Prog   E. Teachers are exposed to a variety of scient & tech careers.                 4.19    20   6      5
Teach  C. increase awareness of specific subject to work application.                 [?]     16   8      6
Teach  J. be more self confident in work-world skills.                                [?]     17   9      5
Teach  P. share experience with school personnel or community grps.                   [?]     16   8      7
[?]    H. Teachers successfully complete the task assigned to them.                   4.17    24   2      3
Stud   B. increase knowledge of careers and requirements.                             4.14    3    5      10
Class  I. integrate math, science and technology.                                     4.12    16   10     7
Class  J. provide more "business/real world" applications.                            4.12    17   4      [?]
[?]    B. Mentors feel that the program is worthwhile for them.                       4.11    20   11     0
Class  U. act more as a facilitator than a lecturer.                                  4.0[?]  14   12     7
Stud   C. have a greater appreciation of role of m/s/t in society.                    4.08    4    7      [?]
Teach  E. know a larger number of post secondary opportunities for students.          4.06    12   10     8
Class  K. value and encourage better communication skills.                            4.06    13   9      10
Comm   C. A "critical mass" of program teachers will influence school climate.        [?]     3    12     17
Prog   C. Increase partic of teachers of underrepresented groups.                     4.00    21   3      7
Teach  G. be more competent in the use of technology.                                 4.0[?]  16   9      7
Stud   E. increase computer & technical literacy.                                     [?]     3    8      21
Stud   H. engage more in cooperative/collaborative learning.                          4.00    3    10     [?]



Two patterns are immediately evident in Table 7. The first is that the highest priority 
items are being evaluated: 28 of the 43 statements received "yes" ratings by the majority of 
responding projects. Evaluation efforts are focused on these most important objectives. As the 
mean priority ratings decrease, fewer projects report collecting formal data. This trend 
continues; of all of the items in the survey, only one with a mean rating lower than 4.0 was 
cited by half the group as a data collection item. That item was related to teacher retention. 

The second pattern is the lack of formal data collection related to student outcomes. 
Very few programs collect any student data, and most of that tends to be rated by the project 
directors as informal or anecdotal. The problems with collecting student data are well-known: 
accessing data from school records; gaining access to schools and classrooms; identifying 
variables that would be appropriate to use across different subjects, grade levels, and student 
demographic groups; identifying or developing valid and reliable instruments; finding the time 
and resources to engage in systematic data collection in many different schools; and 
justifying the time and resource expense, given the difficulties in establishing strong and 
direct relationships between teacher behaviors and student outcomes (e.g., GAO, 1994). 

How are evaluation data collected? 

We asked project directors to respond to a list of possible types of evaluation 
instruments, indicating whether or not they employ that method in their project and, if so, how 
valuable they view that strategy. The value ratings were as follows: 

3 = most valuable    2 = valuable    1 = least valuable    N = never used. 

The results of these ratings are presented in Table 8. 

Project directors are clearly collecting a lot of information about their projects. Teacher 
surveys, Action Plans or technical reports, formal and informal interviews with teachers, site 
visits during the summer, and checks with mentors are part of the repertoire of over two-thirds 
of the projects reporting. Spring follow-up surveys and implementation reports are the primary 
methods of assessing classroom implementation or school/classroom-based transfer of the 
experience. Talks with school personnel, classroom visits, or surveys of longer-term teacher 
behaviors (such as retention) are used less frequently. Only a very few projects attempt to 
collect any data from students. 



Table 8 

Evaluation Instruments Rank Ordered by Use 

Evaluation Instrument                                                Use   Mean Rating
-------------------------------------------------------------------  ----  -----------
I. Informal interviews with teachers, mentors, company personnel     32    2.38
C. Teacher survey at end of the program                              30    2.61
L. Action Plans or technical reports of teachers                     30    2.47
G. Face to face interviews with teachers                             29    [?]
N. Teacher evaluations of program meetings                           26    [?]
A. Teacher survey at entry of the program                            25    2.16
T. Site visits to internship by staff                                [?]   [?]
F. Mentor surveys at end of summer                                   [?]   [?]
D. Teacher survey follow-up in spring semester                       23    [?]
M. Implementation reports of teachers                                23    2.43
S. Talks with principal, dept chair, school administration           18    1.94
U. A template designed to profile programs                           1[?]  2.06
H. Telephone interviews with teachers                                [?]   [?]
R. Classroom visitations                                             15    2.47
B. Teacher survey in the middle of the program                       11    2.27
E. Teacher survey periodically for special topics (e.g., retention)  [?]   2.38
Q. Student surveys                                                   7     1.86
J. Student interviews                                                5     2.00
K. Data collection on student performance                            3     [?]



Use does not necessarily imply value, however. While we did not explicitly define 
"value" in the survey, we feel it safe to assume that project directors responded to this rating 
according to the quality of the information they receive and/or the usefulness of the information 
in documenting effects, understanding processes, or improving activities in the project. Table 
8 also displays the mean "value" ratings (with 3 as the highest and 1 as the lowest possible 
ratings; mean values were calculated based on ratings from projects who use the evaluation 
instrument). 
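
As a hedged illustration of how such a mean could be computed, the sketch below averages only the 1-3 ratings and skips "N" (never used) responses, so that non-users do not pull the mean down. The response list is hypothetical and the function name `mean_value` is an assumption made for this example; neither comes from the survey data.

```python
def mean_value(ratings):
    """Mean of 1-3 value ratings, ignoring 'N' (never used) responses.

    Returns None when no responding project uses the instrument.
    """
    used = [r for r in ratings if r != "N"]
    if not used:
        return None
    return sum(used) / len(used)


# Hypothetical responses from six project directors for one instrument.
responses = [3, 2, "N", 3, "N", 1]

# "Use" is the number of projects that rated the instrument at all.
use_count = sum(1 for r in responses if r != "N")

print(use_count, mean_value(responses))
```

Reporting the use count alongside the mean, as Table 8 does, matters because a mean based on three projects carries much less weight than one based on thirty.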

Project directors assigned the highest values to teacher surveys at the end of the 
program, site visits to the internship site, and mentor surveys at the end of the summer. These 
instruments focus on program implementation and immediate outcomes (to use the categories in 
Figure 1). Informal interviews with teachers, mentors, or company personnel were the most 
frequently used data collection strategy, but this strategy received a notably lower mean value 
rating than other more comprehensive or systematic approaches (such as surveys, Action 
Plans, or implementation reports). Face-to-face interviews were seen as more valuable than 
telephone or informal interviews; internship site visitations were rated as more valuable than 
classroom visitations. Even though 18 projects reported some interaction with school 
administrators, the information obtained from these contacts was rated very low, relative to the 
other strategies. And among those projects that do collect student data, student performance 
data was seen as most valuable, followed by student interviews; student surveys received the 
lowest value rating of all the strategies listed. 

There were some differences between industry-based and research-based projects in 
their value ratings. Managers of industry-based projects assigned higher value to 
implementation reports than did their research-based counterparts. Managers of research-based 
projects assigned relatively higher value to teacher and mentor surveys at the end of the 
summer, and spring follow-up surveys to teachers, than did industry-based managers. The 
groups were very consistent in the high value ratings for site visits to internship sites and 
face-to-face interviews with teachers.^9 

Commonalities in Evaluation Strategies and Concerns 

The SWEP projects, as a group, are collecting a tremendous amount of information 
from participating teachers. Further, most of the group reports that their primary objectives are 
being assessed, at least to some extent. This implies that project directors are focusing their 
resources on the most important aspects of their projects, which in turn might imply that all is 
running smoothly with respect to local evaluation.^10 

The comments from some of the project directors suggest that they are satisfied with 
their existing evaluation strategies. Several are engaged in intensive data collection efforts, 
with multiple data sources (including classroom and student data), and (for at least one) 
longitudinal designs. Others feel that their level of involvement in program operation affords 
them a good sense of what is and is not working and that their "audiences" (e.g., Boards or 
governing councils) are satisfied with the information they are receiving.^11 As one respondent 
said: "My needs are basically met by the many tools I have available to me. Although 
sometimes cumbersome, the tools do get at what the meat of the program is." Another said: 
"Our participants understand/accept the benefits of the program based on our methods of 
evaluation and reporting." 

Throughout the group, however, project directors (even those cited above) convey 
concerns about the lack of "hard" data on program effects, particularly, although not solely, 
with respect to students. For example, one project manager wrote that the local evaluation 



9 Some projects employ a "peer coach" in the summer: an experienced Fellow who can visit the sites, talk 
regularly with the teachers, identify potential problems, and assist teachers in adjusting to the summer 
experience and in reflecting on ways the summer experience transfers to classroom practice. 

10 We wish, in hindsight, that we had asked that question directly! 

11 We should note that this group also contains some project directors who feel the "data collection" burdens 
placed on them and their participating teachers are already too great, and that the money spent on these efforts 
might be better used in networking, disseminating, or providing more project resources at the local level. 
audience "would probably like more hard data, which we are unable to provide." The 
following quotes reflect similar concerns: 

"We need a way to determine unambiguously how successful the programs are. We 
think they are, we 'feel' they are...but we haven't found a way to determine that 
objectively yet." 

"We need 'data' that will be acceptable to business people that clearly shows the value 
of the program... [and perhaps to show that qualitative data is 'good' data]." 

"Biggest issue is time required to collect and analyze the data. There is absolutely no 
doubt in my mind that evaluation is necessary and valuable. Also, in the situation I am 
in, 'soft' data, while okay, is not as valued as 'hard' data and that is much more 
difficult and time-consuming to collect." 

Our sense is that most project directors are satisfied with their current efforts to monitor 
implementation (e.g., if all's okay during the summer internship, if teachers respond well to 
project meetings and activities, if mentors are satisfied with the teacher's work and their 
participation in the project)^12 and to document immediate teacher effects (e.g., valuing the 
summer experience as high quality professional development; attitudes and reactions at the end 
of the summer experience; intents to incorporate new strategies into classroom instruction; 
feeling revitalized or more self-confident about their capabilities). 

There are more concerns, however, about current procedures for documenting actual 
classroom transfer, for trying to establish causal links between the summer experience and 
teachers' subsequent classroom practice, and, of course, for gaining some insight into student 
effects. 

These concerns are legitimate and are confirmed by the list of frequently used 
evaluation strategies. The survey results indicate that most efforts are focused on "self-report" 
data from teachers. While this is the best (and perhaps only viable) method of assessing 
teacher satisfaction with the program or teacher attitudes toward teaching and/or the summer 
experience, it is a less-defensible (though the most efficient) method of assessing other 
outcomes, such as classroom transfer. Action Plans and implementation reports are somewhat 
more direct measures of classroom transfer, but these must be analyzed systematically or coded 
according to some clear-cut criteria if they are to yield data that can provide "harder" evidence 
on how the summer experience contributes to substantive improvements in math and science 
education. Establishing the criteria, reading these reports, and summarizing the information are 
tasks that take a tremendous amount of time (which project directors don't have) and a 

12 We should note, however, that there are "important" program implementation goals that are not being 
assessed, such as whether teachers are placed in the "best possible position." 
combination of expertise in the subject area, in instruction, and in qualitative analyses (which a 
project manager may or may not have).^13 Further, in a comprehensive evaluation, one would 
hope for even more direct evidence, gathered from the classroom, to confirm and expand the 
data collected in teacher surveys and written reports. 

The lack of student data has emerged in all aspects of this survey. Some project 
directors (although perhaps fewer than might be expected) wrote at length about the pressure to 
document student outcomes and their frustration with lack of time, resources, or valid 
measures to do so. The following four quotes are offered as illustration: 

"Our governing council wants data on student attitudes/behavior impacts, but we work 
with teachers from 23 school districts across the country and can't collect data on 
students. Do most SWEPs get student data?" 

"We know (and have supportive data) that teachers and industry benefit [from the 
project]. Less apparent is the degree of 'transfer' to students (and teacher peers and 
administration). Means can be devised to measure student impact, however, 
bureaucracy of school administration must be gotten around. PLUS, the 'pros' need to 
quit shooting disqualifications (i.e., we know that not every single variable can be 
controlled in the social sciences). However, simplistic measurements of student 
knowledge, attitude, observation, and motivation can be accomplished. If significant 
changes occur, then we can start to worry about the various variables which may skew 
findings." 

"Priority should be given to a student outcome evaluation tool. I am concerned about 
the time/cost to develop and implement, reliability/credibility of instrument. We should 
also rate curriculum development to school-to-work (careers) and national skill 
standards. The information from this survey should be used as leverage for funding 
professional design/development of effective evaluation tools. This cannot be done by 
any one SWEP." 

"[Our project] realizes that standardized test scores will show little, if any, statistical 
difference following a teacher's participation in the program. Regardless, data will be 
collected and studied. The program feels the best evidence of student 'change' can 
be measured by 'how they vote with their feet' (i.e., enrollment in science classes 
beyond the required number and level, participation in science clubs, science class 
attendance, etc.). This data is currently being collected for a program evaluation." 



In discussing the evaluation issues that confront them, a number of project directors 
specifically raised the issues of logistics (time and resources to conduct local evaluations). 
Others focused on the somewhat related issues of time, resources, and expertise needed to: a) 
develop valid and reliable measures for important project objectives; b) discern which variables 



13 IISME employed a strategy of having teams of Teacher Fellows review Action Plans and select exemplary 
ones for dissemination. It proved to be a daunting task, even for experienced Fellows. Several projects do 
disseminate Action Plans or classroom projects--in paper form, on networks or disks, or through teacher 
presentations. This is no doubt a valuable and very useful strategy for dissemination and for encouraging 
collegiality and networking among teachers. 
can and cannot be "measured"; c) design a viable set of evaluation procedures that includes 
quantitative and qualitative approaches; and d) build a defensible case for a "mixed methods" 
evaluation plan and for decisions regarding how specific "effects" will and will not be 
assessed. Finally, a third (also related) methodological issue was raised regarding the "power" 
of the conclusions that any one project can draw, given relatively small sample sizes. 

These three sets of issues "come together" in the call for a multi-site collaborative 
evaluation effort. While the amount of pressure being exerted for additional evaluation data 
varies across projects, the survey results suggest that project directors would consider adding 
to or replacing current evaluation strategies if newer methods met one or more of the following 
criteria: a) were part of a comprehensive, defensible plan for assessing important objectives; b) 
focused on difficult-to-measure outcomes; or c) improved the efficiency (e.g., logistics) of 
local data collection efforts. In other words, the project directors seemed receptive to (and in 
some cases, specifically requested) such a collaborative effort. In the following section, we 
explore the viability, and some of the "pros and cons", of a multi-site evaluation. In the final 
section, we offer some suggestions on approaches, tactics, foci, and methods that might be 
considered, should such an effort proceed. 

VIABILITY OF A COLLABORATIVE, MULTI-SITE EVALUATION 
Establishing Common Ground 

Before any collaborative effort at evaluation can proceed, there must be evidence that 
projects share similar views of what they are trying to accomplish and how they intend to 
accomplish it. Given that a) clusters of projects were initiated from a common funder (such as 
TRAC projects) or project model (such as IISME), b) projects have implicitly agreed to a 
common "name" (Scientific Work Experience Programs for Teachers), and c) project directors 
come together in national conferences to share their experiences and strategies, it might 
reasonably be presumed that there is substantial common ground. Local projects, once born, 
take on characteristics of their own, however. Over time, these local features may result in 
projects that share less than their common origin may imply. And, "the devil is in the details." 
That is, while broad intents may be similar, local projects may vary so much in their 
implementation that attempts to "aggregate results" are meaningless. One need only think of 
Head Start, Title I, Follow-Through, Cities-in-Schools, or even Project 2061, to generate 
examples of "national programs" whose "local implementation projects" defy standardized 
evaluation procedures. Thus, even if the results of this survey "tell us what we already know", 
confirmation of areas of common ground is an important first step. 
The survey results do indicate a substantial amount of common ground among local 
projects. First, there is commonality in the contexts for evaluation. The purposes of 
evaluation, the clients and primary audiences, and the uses of evaluation data are quite similar 
across projects. These are the first "facts" an evaluator must determine in designing an 
evaluation, and if the contexts varied too much across projects, a collaborative effort would be 
immediately doomed to failure. 

Second, there is remarkable consensus on important goals and objectives across the 
projects. We would be concerned if this consensus were only on broad, grandiose aims that 
are held by any and all math/science educational programs. There are those types of statements 
in the list (e.g., "students have better appreciation for m/s/t in society"), but on the whole the 
agreed-upon statements reflect a level of specificity that does: a) identify unique intents and 
procedures of the SWEP model; b) facilitate the generation of potential indicators; and c) 
accomplish a) and b), yet allow for local adaptation and variation in the project characteristics 
and actual implementation activities. 

Third, there is consistency across the projects in the scope and types of evaluation (data 
collection) strategies already in place. This suggests that there may already be a "pool" of 
instrumentation for some important objectives. These could be reviewed and streamlined for a 
collaborative evaluation effort (perhaps), thereby allowing time and resources for the 
development of procedures for those objectives not being assessed (or assessed well). 

Fourth, there is consistency in the general approach to evaluation that has been used to 
date, i.e., a "pre-ordinate" or "goals-attainment" approach. This approach may, or may not, 
be the best one to employ for a collaborative evaluation effort (and we discuss this issue more 
in a subsequent section). But the consistency does imply that a consortium of project directors 
would approach the table with a common mind-set on the general model (if not the specific 
methods) of a collaborative effort. 

In our opinion, the survey results confirm that SWEP can be conceptualized as a
program consisting of local projects. There is sufficient justification for an evaluation design
that would result in an aggregation of results across projects and in an analysis of the
relationship between project characteristics and program outcomes. The consensus on program 
objectives indicates that some aspects of the evaluation could focus on the pervasiveness of 
certain effects across local projects, while identifying other effects that are unique to individual 
or subgroups of projects. 



A program is a coordinated effort to address some mission or goal; projects are individual investigative, 
developmental, or implementation efforts under the program "umbrella" (see Joint Committee on Standards for 
Educational Evaluation, 1994; Madaus et al., 1992; Stevens et al., no date).
Benefits of a Collaborative Evaluation

There are at least four categories of potential benefits of a collaborative evaluation (and
these have been implied throughout the preceding sections of this paper). The first is resource
efficiency. Individual projects do not have the time, money, or personnel to conduct intensive 
evaluation studies or to develop and validate specific instrumentation and procedures. Every 
project manager who faces the need to evaluate must develop and implement his/her own 
procedures. A "generic" set of instruments or procedures that could be adapted for local use 
would reduce development time and "reinventing the wheel"; a collaborative effort that funded 
some data collection would free up project directors' time to focus on specific aspects of the 
local project (implementation, documentation, or evaluation) that needed attention.

The second benefit is in determining the effectiveness of the program model in a more 
defensible way. Identifying effects that persist across a variety of projects, and/or increasing
the sample size used in analyses of effects, adds "power" (substantive and statistical) that is difficult to
attain in one local evaluation study. All projects could then use these results in their requests
for funding or sponsorship and in planning new project directions or needs. 

The third benefit is in examining the relationships between project characteristics (e.g., 
number of teachers, length of internship, requirements and/or supports for classroom transfer, 
number of years teachers are allowed to participate, type and amount of follow-up, 
mechanisms for supporting teacher collaboration and collegiality) and types of effects (e.g.,
degree and type of change in classroom practice, sustained improvements in professional
self-esteem or satisfaction). Project directors throughout the SWEP network experiment with
various project requirements, activities, and mechanisms and could greatly benefit from some
feedback on which project characteristics seem to best support which types of outcomes. This
type of information cannot be easily obtained in one local evaluation.

The fourth benefit is the potential impact on policy. Federal funding agencies (such as
the Department of Energy and the National Science Foundation) must make decisions about
which types of projects to fund. The recent General Accounting Office report on
Department of Energy educational programs is a case in point (GAO, 1994). Citing the lack of
"hard"^^ evaluative evidence regarding teacher enhancement projects, the report all but
recommended withdrawing funds from support of those types of projects. Regional or
national organizations of business leaders, partnership programs, and the like also make
recommendations to their membership regarding the types of educational activities to support.
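
The gain in statistical "power" from pooling teachers across projects, noted under the second benefit above, can be illustrated with a rough calculation. The following is a minimal sketch and not a procedure taken from the survey or from the GAO report: it assumes a two-sided, one-sample z-test on a standardized mean change, and the effect size (0.3) and sample sizes (20 teachers at one site, 200 across a consortium) are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def approx_power(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided one-sample z-test.

    effect_size is a standardized mean change (a hypothetical input);
    n is the number of teachers whose data enter the analysis.
    """
    z = NormalDist()                      # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)     # critical value for the chosen alpha
    shift = effect_size * n ** 0.5        # expected test statistic under the effect
    # Probability that the statistic falls in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Hypothetical comparison: one small project versus a pooled multi-site sample.
single_site = approx_power(0.3, 20)
pooled = approx_power(0.3, 200)
print(f"single site: {single_site:.2f}, pooled: {pooled:.2f}")
```

Under these assumptions, the pooled analysis is far more likely to detect the same modest effect than a single small project is; the exact figures depend entirely on the assumed effect size and test.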



^^The criteria used to evaluate evaluation quality were decidedly quantitative. Evaluation methods were
considered "strong" if they included supporting data and (when appropriate) included statistical tests with an n > 30
and a significance level of .05.
Potential Problems and Pitfalls

It appears that there is enough consistency in goals and objectives (at a specific enough 
level to work with) to proceed with discussions regarding a collaborative evaluation plan.
However, there is certainly the potential problem of agreement on the specifics of what should
be assessed and how. We suspect the "what" would be easier to resolve than the "how." As 
long as the consortium recognizes that everything cannot be evaluated at once, and that local 
program priorities may not be fully reflected in a multi-site plan, we would anticipate 
reasonable agreement among participating projects on a subset of goals and objectives to be 
selected. 

Decisions regarding how objectives and project processes are assessed may prove more
troublesome. One problem is satisfactorily establishing the validity of any specific
measurement instruments used. There are the construct validity issues that would accompany
instruments designed to measure teacher attitudes or beliefs, for example. And a given
instrument is valid only in a given context, for a particular, well-defined purpose. Projects
may not feel that a given instrument is a "valid" indicator of their important objectives, or of the
experiences teachers have had the opportunity (and the guidance) to engage in. Another
problem is establishing consensus on what counts as satisfactory evidence. And this problem,
unfortunately, has its roots in the "qualitative-quantitative" debate, or the "paradigm wars" as it
is sometimes called.

Datta (1994) and others (e.g., House, 1994; Yin, 1994) have argued persuasively for
an end to the debate over whether qualitative or quantitative procedures are "better." Qualitative
and quantitative methods serve different purposes, address questions differently, and provide
different types of answers; the selection of methods depends on the context, the "match"
between questions and methods, and, to a large extent, the preferences of the evaluator hired to
conduct the evaluation. Most of us "in the field" have become comfortable with using different
methods for different purposes, although we don't always "mix methods" well. As noted in
some of the preceding quotes from project directors, however, sponsors of SWEPs (business
managers, scientists, federal agencies) do have a tendency to distinguish between "hard" and
"soft" data, with the "softer" data presumably that of case studies, interviews, and descriptions
of project activities.

Datta (1994) points out that federal agencies have accepted case study data for a number
of years, although the preference for a particular paradigm might fluctuate over time and across
agencies. She cites, as one example, the preference for randomized and quasi-experimental
designs at the U.S. Department of Education during the 1970s, while the National Science
Foundation education offices emphasized case studies during the same period. She also
estimates that approximately one-third of the non-financial audits conducted by the U.S.
General Accounting Office (GAO) involve some type of case study. But, she goes on to
explain:

The price for acceptability within GAO for case studies is the same as it is for any
method: an emphasis on study quality, including documentation of the basis for all
statements and findings in a report that can be checked independently through a
quality-assurance process called 'indexing/referencing'. Subjectivity, in the sense of using as
data the impressions of the evaluatee and evaluator, does not in itself create problems
for the agency; bias does. 'Case studies, like any other method GAO uses, have to meet
two criteria of impartiality: accuracy and lack of bias in the sense that the evaluator's
personal, preconceived opinions about the situation do not distort reporting and that the
evaluator is scrupulously even-handed in examining all sides of a situation' (Datta,
1990, p. 63, cited in Datta, 1994, p. 56).

In a similar vein, Yin (1994) cites four characteristics of "quality" that should be of
utmost importance, regardless of the type of data collection methods used: a) thorough
coverage and investigation of all evidence; b) constant awareness and testing of rival
hypotheses; c) results have significant (substantive) implications beyond the immediate work;
and d) demonstrated depth of expertise about the subject at hand.

From a slightly different perspective, Joseph Wholey (see Shadish, Cook, and Leviton
(1991) for a summary of Wholey's ideas and methods) has stressed the importance of making
practical decisions about what will count as evidence in a given situation. He points out that, in
practice, decisions must be made about the allocation of resources and not all project objectives
can be subjected to intensive data collection. In a given situation, "rough" indicators may be
perfectly appropriate for some objectives, while other objectives (because of their importance,
measurability, or other issues) may merit more intensive study. Wholey advocates
involvement of the ultimate "decision-makers" in making choices about the types of evidence
and the resource allocation that will be used in a given evaluation context.

All of this is to say that the emphasis must be on the quality of the design,
implementation, and interpretation of the evaluation study, and not on deciding a priori whether
qualitative or quantitative procedures should dominate.

The largest and potentially most troublesome "pitfall", in our opinion, is related to
what Robert Stake has termed the "quieting of reform" (Stake, 1986). Stake has noted that in
many educational and social service contexts, the potential value of the reform is squelched (or
at least "quieted") because the outcomes are not easily measured or are not susceptible to 
quantitative indices and causal conclusions. Program operations, bent to focus most on the 
"bottom line" indicators to be used in a formal evaluation, may suffer. The emphasis on 
"scientific knowledge", to the exclusion of the "common knowledge" or insights into the 
complexity of the program held by its practitioners, may not in the end serve the program or its 
constituents well. 
In his keynote speech at the first annual meeting of the UK Evaluation Society (1995),
Stake also discusses the "criterion problem," i.e., the difficulty of identifying measures of
good teaching, of quality education, or of successful student learning. Citing the pressure to
demonstrate immediate change in student performance as a result of a classroom innovation, he
notes:

"The usual finding is that the innovation has not improved student performance, and
that is one reason why reform is so difficult. Better teaching for a few months changes
the quality of education a very small amount. Even better learning conditions, better
fellow students, better support from parents, all of these improve knowledge and
academic skills only gradually. We can make our classrooms better but the progress
of that innovation will seldom show up on our criterion tests. Single-component
changes in education seldom change the quality of education. When movement occurs,
the pace is evolutionary, not revolutionary. So--faced with the difficulty of providing a
proper criterion and faced with the intransigence of educational systems--we evaluators
should be reluctant to share the enthusiasm of innovation advocates. We should be
reluctant to assure we will measure the good that will come of it" (Stake, 1996, pp.
101-102).

Stake concluded his remarks with a call for greater emphasis on establishing the validity of 
evaluation studies and on effectively describing the activities of good teaching that we do find. 

There are, of course, more mundane (but critical) issues that must be considered in a
collaborative effort, such as funding for the evaluation, to whom (and through what
competitive mechanism) to award the evaluation contract, and how to proceed with planning
and designing the evaluation. In the following section, we offer some strategies and
approaches that might be considered by a panel charged with following through on the idea of
collaborative evaluation.



SOME METHODOLOGICAL AND SUBSTANTIVE ISSUES TO CONSIDER 



Many pages have been spent in this paper outlining the results of the evaluation survey
and making a case (we hope) for the viability of a coordinated, collaborative effort to evaluate
the SWEP program model. This was, we believe, a necessary first step and the type of
information a task-force (or a potential evaluator) would need to proceed with the next steps of
planning such a coordinated effort. There is a second paper that needs to be written, to
examine more fully some of the possibilities and strategies that could be used to guide the
evaluation design; perhaps this second paper will emerge from the 1996 national conference of
SWEPs, or perhaps it is best written by the respondents to a request for evaluation proposals.
Nevertheless, we offer some notes and comments on issues that we feel should be considered
more fully in the "next phase"--whatever form that may take.
There is no one right strategy for conducting an evaluation. Over the past 30 years or
so, evaluation theorists and researchers have explicated a variety of "approaches" to evaluation
(for example, see summaries in Patton, 1982; Shadish, Cook & Leviton, 1991; see also
McLaughlin & Phillips, 1991). The "orientations" of the various approaches (e.g.,
objectives-oriented, management-oriented, consumer-oriented, expertise-oriented, adversary-oriented,
naturalistic and participant-oriented, to use Worthen and Sanders' (1987) classifications) 
sometimes imply a preference for relatively more emphasis on qualitative or quantitative 
procedures, but theoretically the approach does not dictate the type of data to be collected. 
Rather, selection of an approach^^ has implications for the types of evaluation questions posed, 
the uses of the evaluation results, and the relationships between the evaluator and the project 
staff. 

Often, three broad labels (goals-oriented, decision-oriented, and responsive) are
sufficient to distinguish among approaches (Madaus, Haney & Kreitzer, 1992). In goals- or
decision-oriented approaches, evaluation data are collected according to a pre-established
framework of variables to be assessed. Projects may be evaluated according to the extent to
which they have attained goals (for implementation and/or for outcomes), using pre-established
criteria for "success". Or the framework may be derived from the specific types of information
project management needs to make specific types of decisions regarding the project at a
particular point in time (e.g., information about inputs and costs, context and process, intended
as well as unintended outcomes). Responsive approaches focus more on understanding and
describing the complexities of an educational activity, on "what is happening" rather than on
"what should be happening", and on representing the needs and perspectives of the participants
and various stakeholders. Pre-ordinate approaches tend to employ more quantitative measures
and statistical or cost-benefit analyses; responsive approaches tend to rely more on naturalistic,
ethnographic, and qualitative methods for data collection and analyses.

Currently, SWEPs tend to be more "objectives-oriented" or pre-ordinate in their
approaches to implementation evaluations (such as the NCISE template) and outcome
evaluations (e.g., Dubner, 1994; Gottfried et al., 1992), with reliance on a mix of quantitative
(surveys) and qualitative (interviews, focus groups, site visits) data collection strategies. There
are examples of somewhat more "responsive" approaches, in descriptions of teachers' summer
experiences or in journalistic accounts of the ways teachers apply their experiences to
classroom practice or professional growth (e.g., the IISME "Success Stories").



Practicing evaluators rarely use these approaches as models to be followed "to the letter" in a given 
evaluation, but rather pick and choose elements from various approaches to fit the evaluation problem and 
context at hand. 
In a multi-site evaluation, it is likely that a goals-oriented approach would dominate
(appropriately, perhaps), since there is a clear delineation of (some) objectives and a strong
interest (from project management and decision-makers) in assessing the extent to which goals
have been attained. Self-report surveys and direct measures of teacher knowledge or
classroom practice or student performance may be combined with project descriptions,
interviews, and vignettes of illustrative practice. Pre and post data may be collected, but we
see more of an emphasis on "progress toward goals" than on the use of comparison groups.

Responsive approaches should be given some thought, however. It appears that
relatively little emphasis has been placed on giving voice to, or understanding the perspectives
of, the various audiences that have a stake in the projects. The most obvious example is the
school community. Principals, department chairs, district administrators and staff developers,
school board members, other teachers in the school, parents, students: all have a "stake", to
some degree, in the projects. How does "what the teacher brings back" fit into the broader
needs of the school community? The perspectives of the teacher fellows themselves should
perhaps be examined in a more responsive way: what do the teachers generate as the important
benefits of participation? Which of their professional needs are being met, and which aren't?
How does their SWEP participation fit in with other professional experiences, responsibilities,
and demands? From the business and research communities, we might search for better
understanding of how this program contributes to their goals for supporting education, and/or
the criteria they use to determine which types of programs to support. We suspect that the
answers to some of these questions lie in the store of "common knowledge" project directors
possess. But we also suspect that a systematic attempt to understand the perspectives of the
various stakeholders may cast new light on program goals and priorities, on areas of program
implementation and outcomes that need to be defined and explored further, and on the
implications of the program for meaningful educational reform.

Methodological Models 

In a goal attainment approach, a multi-site evaluation plan would identify a core set of
objectives that can be assessed across projects, to determine the degree and the pervasiveness 
of outcomes. To the extent that a core set of evaluation instruments (whether these are 
surveys, interviews, or other types of indicators) can be used for certain objectives, the data 
from these instruments can be aggregated across sites and thereby increase the sample size for 
the analyses. Local project characteristics can be documented and some of these used in 
analyses of the relationship between implementation variables and outcome variables. 

For some objectives, standard instruments may not be feasible or desirable. It may be
possible, however, to adapt an approach akin to meta-analytic techniques, to combine data
across projects. In meta-analysis (Hedges and Olkin, 1985), techniques can be used to
combine data on similar constructs, even when the specific methods of measuring the construct
vary. These techniques are also useful in identifying factors (like project characteristics, or
types of measurement instruments used) that might account for differences in results across
project sites, as well as for calculating effect sizes to summarize the relationships among
variables of interest.
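
As a rough illustration of how results measured with different instruments might be combined, the sketch below converts each site's result to a standardized effect size and pools the effect sizes with inverse-variance weights. It is one common fixed-effect variant, not necessarily the procedure the authors had in mind. It assumes each site can report a mean, standard deviation, and sample size for SWEP participants and for some comparison group; the site names, the numbers, and the helper names (hedges_g, pool_fixed_effect) are all hypothetical.

```python
import math


def hedges_g(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Return (g, variance of g): a standardized mean difference between
    two independent groups, with a small-sample correction applied."""
    df = n_a + n_b - 2
    sd_pooled = math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df)
    d = (mean_a - mean_b) / sd_pooled
    correction = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n_a + n_b) / (n_a * n_b) + d**2 / (2 * (n_a + n_b))
    return correction * d, correction**2 * var_d


def pool_fixed_effect(effects):
    """Inverse-variance weighted mean of (estimate, variance) pairs,
    returned together with its standard error."""
    weights = [1 / var for _, var in effects]
    pooled = sum(w * est for w, (est, _) in zip(weights, effects)) / sum(weights)
    return pooled, math.sqrt(1 / sum(weights))


# Hypothetical site summaries, each on its own instrument and scale:
# (participant mean, comparison mean, participant sd, comparison sd, n, n).
sites = {
    "site_1_attitude_survey": (3.9, 3.6, 0.8, 0.9, 25, 25),
    "site_2_interview_rating": (7.2, 6.5, 1.5, 1.4, 18, 20),
    "site_3_checklist_score": (14.0, 13.1, 3.0, 3.2, 30, 28),
}
effects = [hedges_g(*summary) for summary in sites.values()]
pooled_g, pooled_se = pool_fixed_effect(effects)
print(f"pooled effect size: {pooled_g:.2f} (standard error {pooled_se:.2f})")
```

Because each result is expressed in standard-deviation units before pooling, the sites need not share an instrument. The pooled standard error is smaller than that of any single site, which is the sense in which aggregation adds precision.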

This meta-analytic mindset may be particularly useful for collecting student data. While
there may well be standard approaches that are valid across sites and across teachers within
sites for assessing certain student outcomes (attitudes toward math/science/technology, for
example), more valid measures of what students are gaining are likely to be much more teacher
specific. It is an integral part of the SWEP model, we contend, that teachers use the
information and insights they have gained in the summer in ways that they deem most
appropriate for their given classroom situation. In other words, the teachers set the
instructional goals that they have for themselves, as a result of the summer experience, and
these goals will (and should) vary across teachers. It may be reasonable to ask teachers to
generate evidence of student performance themselves, as part of their ongoing classroom
instruction and assessment. If the construct can be identified in these assessments,
meta-analytic techniques may prove useful in combining different types of evidence regarding similar
achievement or skills variables. This approach may be particularly appropriate in light of recent
advances in alternative forms of assessment (which are by definition "non-standard" and which
are, by design, better indicators of problem-solving skills and other "higher-order" cognitive
processes than more traditional forms of testing (e.g., Harmon, 1995)).

A related evaluation methodology model is "cluster evaluation." As defined by Jenness 
and Barley (1995), cluster evaluation is 

an evaluation methodology that engages a group of projects/programs with common
or similar purposes in common evaluation efforts to determine the impact of the set of
projects. The evaluation provides a complex, rich data set derived to a large extent
from the involvement of stakeholders in the formation of the evaluation itself. The
processes of the cluster also enable and prepare project directors to improve their own
evaluation skills, thereby allowing them to become better consumers of evaluation data
(p. 57).

The authors define nine major elements in this evaluation methodology: 1) organizing the
cluster; 2) cluster evaluation team selection; 3) setting clear expectations; 4) negotiated common
cluster outcomes; 5) collaborative data collection; 6) regular networking conferences; 7)
technical assistance to individual projects; 8) data analysis and interpretation; and 9) cooperative
dissemination of results (p. 60). They include examples of evaluations of science education
reform efforts to illustrate these elements in practice.

^^The authors report that this methodology was initiated by the W.K. Kellogg Foundation in the late 1980's,
and that the Foundation has continued to support evaluation efforts employing this methodology.

The cluster evaluation methodology does not dictate particular evaluation designs or 
approaches, but it does provide an organizational structure for conducting collaborative 
evaluations. The SWEP consortium already has some of the required elements of this 
methodology: the existing network forms a basis for organizing a "cluster," and there is already 
a model of "regular networking conferences" as well as stated interests in a collaborative
evaluation effort. And certainly, the survey results reported in this paper represent a start on 
"negotiating common cluster outcomes". 

Measuring Program Variables

Techniques for documenting local project characteristics and implementation have been
developed by individual projects and by external agencies (such as the template designed by
NCISE). The challenge in a multi-site evaluation would be to select a subset of important
variables in this category to document, and to design documentation procedures that are not
overly burdensome to project staff.

Different projects have experimented with various methods of documenting and
assessing teacher effects. These methods are primarily self-report, but there are examples of
attempts to use more "direct" assessments of changes in teachers' knowledge or scientific
process skills (Gottfried et al., 1992), philosophical views, or self-esteem^*. While teachers'
self-report data often indicate that teachers believe they have changed, more direct measures
have failed to detect these changes (Omer, in progress). The reason may be that the measures
are not assessing the right things; it may also be that teacher fellows rate highly on these
measures at the outset, creating "ceiling effects" in the instruments; or it may be that the
relatively short summer experience is not enough to yield meaningful and measurable change
(as the 1994 GAO report contends).

One promising approach to studying teacher effects of SWEP participation is derived
from recent research on professional development models (Little, 1993; Little and McLaughlin,
1993; McLaughlin et al., 1992). These research efforts have identified components of quality
professional development opportunities and have emphasized the importance of collaboration,
collegiality, and community among teachers. Claire Omer (in progress) is developing a plan
for studying SWEPs in light of a model of Professional Learning Communities (PLC). These
research-based and theoretical frameworks may be useful in guiding new conceptualizations of
teacher effects in SWEPs and of methods for assessing these effects. [This is one area, by the
way, in which comparative designs may be feasible and appropriate.]

^* References to these types of data collection activities were made by some respondents to the survey reported
here (e.g., Nancy Roberts of Creating Lasting Links and Joanna Fox of GIFT).

Mechanisms for assessing changes in classroom practice should also be based on some
defensible framework. The recent, well-publicized efforts in developing "standards" for math
and science curriculum and instruction (NCTM, NSTA, Project 2061, New Standards Project,
etc.) provide some sources for developing such a framework. There are two issues that we
believe should be taken into account in this area, however. The first is the degree to which
projects specify their expectations with respect to classroom practice. If specific expectations
are not conveyed, or if the project does not have mechanisms for supporting teachers in
meeting these expectations, it may not be reasonable to define a specific set of classroom
practices to assess. Second, SWEP teachers may already be practicing many of the "desired"
techniques and strategies in their classrooms; modifications may be subtle and not obvious
enough to be detected by observation checklists or classroom learning environment surveys.

In assessing student outcomes, it may be possible to design (or select existing) surveys
to assess student attitudes towards or interests in math, science, and technology. At least one
project has collected indicators of "how students vote with their feet" (see quote on page 21).
But the survey results reported here indicate a more widespread interest in documenting
students' problem-solving, observational, and analytic skills. Frankly, we see no hope for
developing "standardized" measures of these skills that would be appropriate across the subject
areas, grade levels, and school/classroom/community contexts teacher fellows represent. The
only approach we can think of would have to involve the teachers in designing, implementing,
and scoring the student assessments. As we discussed earlier, there may be promise in using
assessment results generated by the teachers as part of their classroom-based assessment
practice. While this approach is fraught with difficulty and is likely to be resource-intensive, it
may be worth some preliminary pilot-testing to see if procedures could be developed.

NEXT STEPS 

If there is continued interest in a collaborative evaluation of the SWEP model, the
upcoming national conference seems an ideal time to plan next steps. It would be very useful
for the consortium to discuss the findings from this survey, to determine if there is consensus
among the group on important objectives and approaches, and to confirm the applicability of
these findings to those projects that did not respond to the survey. A task force could perhaps
be convened to further refine the intents, purposes, and objectives of a multi-site evaluation, to
solicit participation from specific projects, and to explore possible sources of funding. The
evaluation should be conducted by an outside evaluator, we believe, but one who would work
effectively and collaboratively with the task force (or other "steering committee") throughout

SUMMARY OF RESULTS



APPENDIX A



1. Do you do any formal evaluation?     YES     NO     Developing
                                         27      2          3

2. Circle the value of those instruments or methods you have used:

1 = most valuable     2 = valuable     3 = least valuable     n = never used



EVALUATION INSTRUMENT                                                     Most  Valuable     Least  Never    Mean
                                                                      Valuable            Valuable   Used  Rating
A. Teacher survey at entry of the program                                    7        15         3      8    2.16
B. Teacher survey in the middle of the program                               3         8         0     10    2.27
C. Teacher survey at end of the program                                     20        10         0      3    2.67
D. Teacher survey follow-up in spring semester                              12        10         1     10    2.48
E. Teacher survey periodically for special topics (e.g., retention)          4         3         1     22    2.38
F. Mentor surveys at end of summer                                          15         9         0      7    2.62
G. Face to face interviews with teachers                                    17         9         3      3    2.48
H. Telephone interviews with teachers                                        6         6         4     15    2.12
I. Informal interviews with teachers, mentors, company personnel            14        16         2      0    2.38
J. Student interviews                                                        1         3         1     27    2.00
K. Data collection on student performance                                    2         2         1     28    2.30
L. Action Plans or technical reports of teachers                            14        16         0      4    2.47
M. Implementation reports of teachers                                       12         9         2      9    2.43
N. Teacher evaluations of program meetings                                  13        12         1      7    2.46
O. A template designed to profile programs                                   3        12         2     16    2.06
P. Site visits to internship by staff                                       17         7         1      9    2.64
Q. Student surveys                                                           2         2         3     26    1.36
R. Classroom visitations                                                     8         6         1     18    2.47
S. Talks with principal, dept. chair, school administration                  5         7         6     15    1.94
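
A note on reading the Mean Rating column: although the response scale labels 1 as "most valuable," the reported means appear to have been computed with the scoring reversed (most valuable = 3, valuable = 2, least valuable = 1) and with "never used" responses excluded. This is an inference from the tabled counts, not something stated in the survey summary. The short check below, a sketch using a hypothetical helper name (mean_rating), reproduces the reported means for instruments A and C under that assumption.

```python
def mean_rating(most: int, valuable: int, least: int) -> float:
    """Mean rating with reversed scoring; 'never used' responses are excluded."""
    total = most + valuable + least
    return (3 * most + 2 * valuable + 1 * least) / total


row_a = round(mean_rating(7, 15, 3), 2)   # counts for "Teacher survey at entry of the program"
row_c = round(mean_rating(20, 10, 0), 2)  # counts for "Teacher survey at end of the program"
print(row_a, row_c)  # prints 2.16 2.67
```

On this reading, a higher mean indicates an instrument that respondents who used it found more valuable.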



Describe any other methods or instruments you have used. 





3. Are you required to do evaluation?     YES     NO
                                           24     11

By whom?

Funding Agencies (NSF, Dept. of Energy, NIH)
Local governing board (Board of Directors, self-governing councils)
Designated outside evaluator

What information do they want to know?

How effectively program is addressing goals and objectives (9)
Is program implemented according to plan (6)
Teacher Outcomes/Impact on teachers (8)
Classroom Transfer (2)
Student Outcomes (3)
Sponsor/Mentor Satisfaction (2)



4. Who would read an evaluation report if you wrote it?

Funding Agency (program officer)         (21)
Sponsors: management and mentors         (15)
Internal Staff and governing boards      (15)
Teacher Participants                      (7)
Academic Colleagues                       (6)
School Administrators                     (5)



5. Here are some possible uses of evaluation. Rate the priority each has
(or would have) in evaluating your program.

1 = primary purpose     2 = secondary purpose     3 = probably not a purpose

PURPOSE                                                        1. PRIMARY  2. SECONDARY  3. NOT  MEAN RATING
A. Monitor existing program's outcomes                                 34             1       0         2.97
B. Monitor new/pilot program methods or strategies                     23            11       1         2.63
C. Adjust immediate program, e.g., presentations & meetings            22             9       3         2.56
D. Use as justification for funding                                    24             9       3         2.58
E. Use to explain why your SWEP program should be continued            21             9       2         2.59
F. Use to compare your SWEP with other programs                         4            16      14         1.71
G. Provide accountability to others                                    18            13       4         2.40



Describe any other uses you might have for evaluations. 

Recruting (sponsors, mentors, teachers) (5) 
Information to administrators and teachers (5) 
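
The mean ratings reported for question 5 are consistent with a reverse-coded average: each "primary purpose" response scored as 3, each "secondary purpose" response as 2, and each "probably not a purpose" response as 1, divided by the number of programs answering that item. The survey does not state this scoring rule; it is an assumption inferred from the printed counts, and the function name `mean_rating` is hypothetical. A minimal Python sketch under that assumption:

```python
def mean_rating(primary: int, secondary: int, not_purpose: int) -> float:
    """Reverse-coded mean: primary = 3, secondary = 2, not a purpose = 1.

    Assumed scoring rule (not stated in the survey itself).
    """
    n = primary + secondary + not_purpose
    if n == 0:
        raise ValueError("no responses for this item")
    return (3 * primary + 2 * secondary + 1 * not_purpose) / n


# Response counts taken from rows A, B, F, and G of the question 5 table.
rows = {
    "A": (34, 1, 0),
    "B": (23, 11, 1),
    "F": (4, 16, 14),
    "G": (18, 13, 4),
}
for label, counts in rows.items():
    print(label, round(mean_rating(*counts), 2))
```

Under this assumption the sketch reproduces the means printed for these rows (2.97, 2.63, 1.71, and 2.40), which suggests that a higher mean indicates a higher priority even though the response scale labels "primary purpose" as 1.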

6. How do you currently use the data you collect? 



For questions 7-12, rate each item on the following topics using the indicated scale.

A. Rate the level of importance / priority of the following intended outcomes for your specific SWEP.

5 = Highest priority: critical outcome of our program; program cannot be considered successful
if this does not occur for most teachers

3 = Moderate priority: desired objective of our program; would hope this occurs for many teachers
1 = Low priority for our specific program: would be "nice" if this occurred for some teachers.

B. Have you systematically collected data to evaluate this area?

Y = Yes    I = Informal or anecdotal    N = No

7. Institutional/Corporate Support

(Please circle one number for importance AND one letter for having collected data.)



CAT   ITEM                                                                     Mean  SD    Yes  Inf/An  No
IS    A. Mentors feel that the program is worthwhile for teachers.             4.40  0.77   21    10     1
IS    B. Mentors feel that the program is worthwhile for themselves.           4.11  0.71   20    11     0
IS    C. Mentors altered perception of schools and school needs.               3.63  1.06   13    12     5
IS    D. Mentors gain knowledge of teacher duties/responsibilities.            3.66  0.94   13    13     4
IS    E. More institutional people are involved with education committees      2.86  1.14    4    13    12
         and schools.
IS    F. Institutions will refine networking skills regarding education.       3.00  1.52    4    10    16
IS    G. Program board will be actively involved with education.               2.68  1.52    2     6    18
IS    H. Teachers successfully complete the task assigned to them.             4.17  1.06   24     2     3

      CATEGORY TOTALS                                                          3.58  1.19  101    77    59



8. Program Implementation 



CAT   ITEM                                                                     Mean  SD    Yes  Inf/An  No
Prog  A. Screening process places teachers in best possible position.          4.43  0.55   13     9     9
Prog  B. Teachers adjusted well to the demands of internship.                  4.50  0.56   17    10     2
Prog  C. Increased participation of teachers of underrepresented groups.       4.00  1.00   21     3     7
Prog  D. Orientation and other program meetings will enhance internship.       4.50  0.78   24     2     5
Prog  E. Teachers are exposed to a variety of scientific & technical careers.  4.19  1.01   20     6     5
Prog  F. Teachers receive advice and support for sharing experience.           4.52  0.93   17     8     6
Prog  G. Teachers receive support for extending experience to classroom.       4.66  0.54   25     5     1
Prog  H. Mechanisms / academies are developed to continue dialogue after       3.64  1.55   14     8     9
         the internship.
Prog  I. Teachers will consider internship as a high level professional        4.57  0.90   22     6     4
         development program.

      CATEGORY TOTALS                                                          4.58  0.93  173/9 = 19.2   57/9 = 6.3   48/9 = 5.3






9. Teacher Effects

As a result of a SWEP internship, teachers will

CAT   ITEM                                                                     Mean  SD    Yes  Inf/An  No
Teach A. gain first hand knowledge of industry/research culture and careers.   4.81  0.47   21     8     2
Teach B. be more knowledgeable of [illegible] or research processes.           4.32  0.34   19     9     4
Teach C. increase awareness of specific subject to work application.           4.19  0.95   16     8     6
Teach D. be more knowledgeable in their subject area.                          4.38  0.97   17     5     8
Teach E. know a larger number of post secondary opportunities for students.    4.36  0.33   12    10     8
Teach F. be more active with email and on the Internet.                        3.41  1.17   12     7    12
Teach G. be more competent in the use of technology.                           4.00  0.79   16     9     7
Teach H. increase the use of supplemental material and outside resources.      3.35  0.33   12    11     8
Teach I. develop activities to use in their classroom.                         4.30  1.13   24     4     2
Teach J. be more self confident in work-world skills.                          4.19  0.98   17     9     5
Teach K. demonstrate renewed enthusiasm for teaching.                          4.39  0.36   16    10     4
Teach L. become credible model to students of excitement with m/s subjects.    4.31  0.39   12    12     6
Teach M. have higher professional self esteem.                                 4.43  0.98   14    10     6
Teach N. be revitalized after the summer.                                      4.37  0.34   13    11     5
Teach O. have new perspectives on education.                                   4.31  1.31   17     5     8
Teach P. share experience with school personnel or community groups.           4.19  0.36   16     8     7
Teach Q. assume new leadership roles in school or district.                    3.77  1.36   13     9     8
Teach R. conduct in-service courses related to internship.                     3.38  1.38   11    10    10
Teach S. be retained in the teaching force.                                    3.39  1.45   16     4    10
Teach T. continue with even more professional development.                     3.31  1.08    7    13    11
Teach U. become involved in school reform outside their own classrooms.        3.19  1.37    9     8    14

Teach CATEGORY TOTALS                                                          4.05  1.09  310/21 = 14.76   180/21 = 8.57   151/21 = 7.19
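
The three fractions in the Teacher Effects totals row can be read as column sums divided by the 21 items in the category, that is, the average number of programs per item reporting systematic data, informal or anecdotal data, or no data. This reading is an inference from the table rather than something the survey states, and the names used below are illustrative. A short Python sketch using the per-item counts for items A through U:

```python
# Per-item response counts for items A-U of the Teacher Effects category,
# copied from the table: systematic data collected (Yes), informal or
# anecdotal data only, and no data collected.
yes = [21, 19, 16, 17, 12, 12, 16, 12, 24, 17, 16, 12, 14, 13, 17, 16, 13, 11, 16, 7, 9]
informal = [8, 9, 8, 5, 10, 7, 9, 11, 4, 9, 10, 12, 10, 11, 5, 8, 9, 10, 4, 13, 8]
no = [2, 4, 6, 8, 8, 12, 7, 8, 2, 5, 4, 6, 6, 5, 8, 7, 8, 10, 10, 11, 14]


def column_summary(counts):
    """Return (column total, average count per item rounded to 2 places)."""
    total = sum(counts)
    return total, round(total / len(counts), 2)


for name, column in [("Yes", yes), ("Inf/An", informal), ("No", no)]:
    total, per_item = column_summary(column)
    print(f"{name}: {total}/{len(column)} = {per_item}")
```

The column totals come out to 310, 180, and 151, matching the numerators printed in the totals row.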



10. Classroom Effects

As a result of a SWEP internship, teachers will

CAT   ITEM                                                                     Mean  SD    Yes  Inf/An  No
Class A. invite mentors and speakers to schools.                               3.63  1.11   12    12     8
Class B. take students on field trip to internship site.                       3.46  1.37   12    12     8
Class C. take students on more field trips to industry and lab sites.          3.11  1.18    7    13    11
Class D. receive materials or equipment from lab or industry.                  3.26  1.11   14    11     7
Class E. provide activities that strengthen school-industry partnerships.      3.47  1.34   10    12    10
Class F. increase emphasis on work habits such as punctuality, dependability,  3.36  1.32    8    11    13
         meeting deadlines, & professionalism.
Class G. revise or add new content to lessons & labs.                          4.36  0.36   21     6     5
Class H. include lessons on science careers & job requirements.                3.34  1.38   12     8    13
Class I. integrate math, science and technology.                               4.12  1.37   16    10     7
Class J. provide more "business/real world" applications.                      4.12  1.33   17     4    11
Class K. value and encourage better communication skills.                      4.06  1.15   13     9    10
Class L. use applications & examples from summer experience.                   4.38  0.34   20     5     3
Class M. cover fewer topics but in more depth.                                 3.39  1.33    6     8    17
Class N. require more oral reports presentations.                              3.39  1.38    9     9    13
Class O. assign more formal written reports.                                   2.71  1.12    8     8    16
Class P. assign long term joint projects.                                      2.94  1.38   10    11    13
Class Q. increase use of computers & technology in their classroom.            3.30  0.36   13    10    10
Class R. promote student investigation & inquiry.                              4.37  0.77   17     9     7
Class S. use more teamwork and cooperative learning with students.             4.49  0.32   20     6     6
Class T. design & implement more hands-on lessons.                             4.38  0.78   19     8     6
Class U. act more as a facilitator than a lecturer.                            4.39  1.39   14    12     7

Class CATEGORY TOTALS                                                          3.72  1.31  278   194   201



11. Student Outcomes

As a result of having a teacher with a SWEP internship, students will

CAT   ITEM                                                                     Mean  SD    Yes  Inf/An  No
Stud  A. increase respect for teachers and teachers' abilities.                3.33  1.34    2     7    22
Stud  B. increase knowledge of careers and requirements.                       4.14  0.91    3     9    19
Stud  C. have a greater appreciation of role of math, science and technology   4.38  0.91    4     7    21
         in society.
Stud  D. improve skills in problem solving.                                    4.39  0.37    3     9    20
Stud  E. increase computer & technical literacy.                               4.30  0.36    3     8    21
Stud  F. increase involvement in extra curricular math/science programs.       3.19  1.19    3     8    22
Stud  G. enroll in m/s courses beyond required number & level of difficulty.   3.36  1.32    3     7    23
Stud  H. engage more in cooperative/collaborative learning.                    4.30  1.17    3    10    19
Stud  I. show increased enthusiasm and appreciation for science/math.          4.32  0.99    5     7    20
Stud  J. consider more a career in math/science teaching.                      3.92  1.32    4     7    21
Stud  K. enhance their observational & analytical skills.                      4.38  0.31    3    11    18
Stud  L. knowledge of world of work; work cultures.                            3.91  1.17    3     7    21

Stud  CATEGORY TOTALS                                                          3.96  1.07   39    97   147



12. School / Community

CAT   ITEM                                                                     Mean  SD    Yes  Inf/An  No
Comm  A. More students graduate and enter math/science fields.                 3.68  1.32    0     1    30
Comm  B. Administration will be more involved in school-community              3.48  1.18    3     3    25
         partnerships.
Comm  C. A "critical mass" of program teachers will influence the climate of   4.03  0.95    3    12    17
         school.
Comm  D. The public will become more involved with issues of education.        3.20  1.45    0     5    27
Comm  E. Students will be better prepared to enter the science/technical       4.20  1.11    2     6    24
         workforce.

Comm  CATEGORY TOTALS                                                          3.72  1.25    8    1?   123



13. What are the big issues in evaluation for you? Attach an additional sheet with comments if necessary.

What are your needs regarding evaluation?

Which areas should be given priority?

Describe your concerns and problems with focusing and implementing evaluations.

Is there anything else you think should have been covered in this survey on the evaluation of SWEPs? 

Besides reporting the responses on this survey to all SWEPs, what would you like to be done with this information? 
APPENDIX B



List of Survey Respondents 



Royace Aikin
Battelle, Richland, Washington TRAC

Allen Dallas
Texas STARS

Don Beck
Cocoa, Florida SIFT

Florine Belanger
San Diego, Calif. Industry Fellows

Gert Clark
Hoboken, New Jersey NJBISEC (TIP)

Thomas Deans
MESTEP

Jay Dubner
Columbia, NYC Summer Research Program

Eileen Engel
LBL, Berkeley, Calif. TRAC

Peter Farnham
Bethesda, Md. ASBMB

Joanna Fox
Atlanta, Georgia GIFT

Richard French
Middletown, Ohio Partners for Terrific Science

Mary Lynn Grayeski
Tucson, Ariz. Partners in Science Research Corp.

Diane Hageman
Hampton, Va ATTAC 2000

Pamela Hall
Medford, Oregon Commy Bus. Ed. Center

Lou Harnisch
Argonne, Illinois TRAC

Lisa Joss
Golden, Colorado TRAC

Bonnie Kaiser
Rockefeller U., NYC Science Outreach Program

J.A. Kampmeier
Rochester, NY Summer Research for HS & College Teachers

Carole Kubota
Seattle, Washington U. Wash Sci/Math

Adele Kupfer
CUNY, NYC STIR

Terry Lashley
Oak Ridge, Tenn. TRAC

Nina Leonhardt
Brookhaven, Upton, NY TRAC

Paul Markovits
St. Louis, Missouri Tech. in Context (TIC)

Marsha Matyas
Rockville, Maryland Frontiers in Physiology

Carol Mooney
Los Alamos TRAC

Pat Moore
Portland, Oregon IISME

Lesa Morris
Boulder, Colorado Col Alliance for Science

Claire Omer
Seattle, College of Ed., U. of Washington

Sue Rinehart
Dayton, Ohio Wright Connection (GEMMA)

Nancy Roberts
Creating Lasting Links

Karin Rosman
SF Bay Area, Calif. IISME

Mary Anne Sheline
Allendale, MI Teachers in Industry

Kaye Storm
Santa Clara County, Calif. IISME Vision

Brian Walentia
Texas A & M Texas Teacher Internship

William Williams
TUAC

Marcy Wood
Albuquerque, New Mexico TRAC



